Truncated Normal distribution (truncnorm)#
A truncated normal is what you get when a normal random variable is restricted to an interval. Formally, it’s a normal distribution conditioned on lying between a lower bound and an upper bound.
It’s a natural model whenever the untruncated normal story is reasonable (additive noise, central-limit behavior), but the variable is physically or logically constrained.
1) Title & classification#
| Item | Value |
|---|---|
| Name | Truncated normal (truncnorm) |
| Type | Continuous |
| Support | \(x \in [\ell, u]\) with \(\ell < u\) (bounds may be \(\pm\infty\)) |
| Parameters | location \(\mu\in\mathbb{R}\), scale \(\sigma>0\), bounds \(\ell<u\) |
| SciPy shape params | \(a = (\ell-\mu)/\sigma\), \(b=(u-\mu)/\sigma\) with \(a<b\) |
A compact way to write the model is:
\[
X \overset{d}{=} \bigl(Y \,\big|\, \ell \le Y \le u\bigr), \qquad Y \sim \mathcal{N}(\mu, \sigma^2).
\]
What you’ll be able to do after this notebook#
write the PDF/CDF of a truncated normal and map parameters to SciPy
compute mean/variance/skewness/kurtosis (and understand why the mean shifts)
interpret how \((\mu,\sigma,\ell,u)\) change the shape
derive the log-likelihood and fit \((\mu,\sigma)\) by MLE when truncation is known
sample from a truncated normal using NumPy only (rejection sampling)
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
import os
import plotly.io as pio
from scipy import optimize, special, stats
pio.templates.default = "plotly_white"
pio.renderers.default = os.environ.get("PLOTLY_RENDERER", "notebook")
np.set_printoptions(precision=4, suppress=True)
rng = np.random.default_rng(42)
2) Intuition & motivation#
What it models#
A truncation happens when values outside a range are impossible to observe (or are removed from the population). For example, imagine an underlying measurement model:
\[
Y \sim \mathcal{N}(\mu, \sigma^2),
\]
but the process can only produce values in \([\ell,u]\). Then the observed variable is the conditional distribution:
\[
X \overset{d}{=} \bigl(Y \,\big|\, \ell \le Y \le u\bigr).
\]
This is not the same as clipping (censoring), where out-of-range values are recorded at the boundary.
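A quick simulation makes the distinction concrete (a minimal sketch; the bounds, sample size, and seed are illustrative):
# Truncation discards out-of-range draws; censoring records them at the boundary.
demo_rng = np.random.default_rng(0)
y_demo = demo_rng.normal(loc=0.0, scale=1.0, size=100_000)
lo_demo, hi_demo = -1.0, 1.0
truncated = y_demo[(y_demo >= lo_demo) & (y_demo <= hi_demo)]  # smaller sample, smooth density
censored = np.clip(y_demo, lo_demo, hi_demo)                   # same size, point masses at the bounds
truncated.size, float(np.mean(censored == lo_demo)), float(np.mean(censored == hi_demo))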
Typical real-world use cases#
Physical constraints: lengths, concentrations, or prices that cannot go negative (often \(\ell=0\)).
Quality control / selection bias: only items within spec are kept; others are discarded.
Psychometrics / surveys: scores restricted to a finite scale (e.g. 0–100).
Truncated regression: outcomes are observed only when they fall within a range (distinct from Tobit censoring).
Bayesian priors with constraints: a normal prior truncated to enforce bounds.
Relations to other distributions#
As \(\ell\to-\infty\) and \(u\to\infty\), the truncated normal becomes an ordinary normal.
One-sided truncation with \((\ell=0,\;u=\infty)\) yields a half-normal when \(\mu=0\) (up to reparameterization).
Truncation is a generic operation: many “bounded” distributions are best understood as conditional versions of simpler base models.
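The first two relations are easy to check numerically (a sketch; the wide bounds \(\pm 8\) stand in for \(\pm\infty\)):
# Wide bounds: truncnorm is numerically indistinguishable from the normal.
x_chk = np.linspace(-3, 3, 7)
wide = stats.truncnorm(-8.0, 8.0, loc=0.0, scale=1.0)
print(np.max(np.abs(wide.pdf(x_chk) - stats.norm.pdf(x_chk))))
# One-sided truncation at 0 with mu=0: matches the half-normal.
x_pos = np.linspace(0, 3, 7)
half = stats.truncnorm(0.0, np.inf, loc=0.0, scale=1.0)
print(np.max(np.abs(half.pdf(x_pos) - stats.halfnorm.pdf(x_pos))))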
3) Formal definition#
Let \(\varphi\) and \(\Phi\) denote the standard normal PDF and CDF:
\[
\varphi(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}, \qquad \Phi(t) = \int_{-\infty}^{t}\varphi(s)\,ds.
\]
Define standardized truncation points
\[
\alpha = \frac{\ell-\mu}{\sigma}, \qquad \beta = \frac{u-\mu}{\sigma}, \qquad Z = \Phi(\beta) - \Phi(\alpha).
\]
Here \(Z\) is the normalization constant (the probability that an untruncated normal lands in \([\ell,u]\)).
PDF#
For \(x\in[\ell,u]\):
\[
f(x \mid \mu,\sigma,\ell,u) = \frac{\varphi\!\left(\frac{x-\mu}{\sigma}\right)}{\sigma\,Z},
\]
and \(f(x)=0\) outside \([\ell,u]\).
CDF#
For \(x\in[\ell,u]\):
\[
F(x \mid \mu,\sigma,\ell,u) = \frac{\Phi\!\left(\frac{x-\mu}{\sigma}\right)-\Phi(\alpha)}{Z},
\]
with \(F(x)=0\) for \(x<\ell\) and \(F(x)=1\) for \(x>u\).
SciPy parameterization#
scipy.stats.truncnorm uses standardized bounds \((a,b)\) plus loc and scale:
\[
X = \mathrm{loc} + \mathrm{scale}\cdot T, \qquad T \sim \mathrm{TruncNorm}\bigl(0,1;[a,b]\bigr).
\]
So for known bounds \([\ell,u]\) you typically compute:
from scipy import stats
a = (lower - mu) / sigma
b = (upper - mu) / sigma
rv = stats.truncnorm(a, b, loc=mu, scale=sigma)
LOG_SQRT_2PI = 0.5 * np.log(2 * np.pi)
def norm_logpdf(z):
z = np.asarray(z, dtype=float)
return -0.5 * z**2 - LOG_SQRT_2PI
def norm_pdf(z):
return np.exp(norm_logpdf(z))
def norm_cdf(z):
# SciPy's ndtr is a fast and numerically stable normal CDF implementation
return special.ndtr(z)
def norm_logcdf(z):
# log Phi(z), stable for large negative z
return special.log_ndtr(z)
def _logdiffexp(log_a, log_b):
"""Compute log(exp(log_a) - exp(log_b)) for log_a >= log_b."""
return log_a + np.log1p(-np.exp(log_b - log_a))
def truncnorm_standardized_bounds(mu, sigma, lower, upper):
mu = float(mu)
sigma = float(sigma)
lower = float(lower)
upper = float(upper)
if sigma <= 0:
raise ValueError("sigma must be > 0")
if not (lower < upper):
raise ValueError("require lower < upper")
alpha = (lower - mu) / sigma
beta = (upper - mu) / sigma
return alpha, beta
def truncnorm_logZ(alpha, beta):
"""Compute log(Z) where Z = Phi(beta) - Phi(alpha)."""
alpha = np.asarray(alpha, dtype=float)
beta = np.asarray(beta, dtype=float)
if np.any(beta <= alpha):
raise ValueError("require beta > alpha")
# Direct CDF difference works well in the lower tail / central region.
logPhi_a = norm_logcdf(alpha)
logPhi_b = norm_logcdf(beta)
logZ_cdf = _logdiffexp(logPhi_b, logPhi_a)
# In the upper tail, it's often more stable to use survival functions:
# Phi(beta) - Phi(alpha) = sf(alpha) - sf(beta) = Phi(-alpha) - Phi(-beta).
logSf_a = norm_logcdf(-alpha)
logSf_b = norm_logcdf(-beta)
logZ_sf = _logdiffexp(logSf_a, logSf_b)
use_sf = alpha > 0
return np.where(use_sf, logZ_sf, logZ_cdf)
def truncnorm_logpdf(x, mu, sigma, lower, upper):
"""Log-PDF of TruncNorm(mu, sigma^2; [lower, upper])."""
x = np.asarray(x, dtype=float)
alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
z = (x - mu) / sigma
logZ = truncnorm_logZ(alpha, beta)
logpdf = norm_logpdf(z) - np.log(sigma) - logZ
return np.where((x >= lower) & (x <= upper), logpdf, -np.inf)
def truncnorm_pdf(x, mu, sigma, lower, upper):
return np.exp(truncnorm_logpdf(x, mu, sigma, lower, upper))
def truncnorm_cdf(x, mu, sigma, lower, upper):
"""CDF of TruncNorm(mu, sigma^2; [lower, upper])."""
x = np.asarray(x, dtype=float)
alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
z = (x - mu) / sigma
Z = norm_cdf(beta) - norm_cdf(alpha)
cdf = (norm_cdf(z) - norm_cdf(alpha)) / Z
cdf = np.where(x < lower, 0.0, cdf)
cdf = np.where(x > upper, 1.0, cdf)
return np.clip(cdf, 0.0, 1.0)
def truncnorm_moments(mu, sigma, lower, upper):
"""Mean/variance/skewness/excess kurtosis (using standardized raw moments)."""
alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
Z = norm_cdf(beta) - norm_cdf(alpha)
phi_a = norm_pdf(alpha)
phi_b = norm_pdf(beta)
# Standardized raw moments for T ~ N(0,1) | alpha <= T <= beta
m1 = (phi_a - phi_b) / Z
m2 = 1.0 + (alpha * phi_a - beta * phi_b) / Z
m3 = ((alpha**2 + 2.0) * phi_a - (beta**2 + 2.0) * phi_b) / Z
m4 = 3.0 + ((alpha**3 + 3.0 * alpha) * phi_a - (beta**3 + 3.0 * beta) * phi_b) / Z
var_t = m2 - m1**2
# Central moments (standardized)
mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
mu4 = m4 - 4 * m1 * m3 + 6 * (m1**2) * m2 - 3 * m1**4
if var_t <= 0:
skew = np.nan
excess_kurt = np.nan
else:
skew = mu3 / (var_t ** 1.5)
excess_kurt = mu4 / (var_t**2) - 3.0
mean = mu + sigma * m1
var = (sigma**2) * var_t
return {
'alpha': alpha,
'beta': beta,
'Z': Z,
'mean': mean,
'variance': var,
'skewness': skew,
'excess_kurtosis': excess_kurt,
'm1': m1,
'm2': m2,
'm3': m3,
'm4': m4,
}
def truncnorm_entropy(mu, sigma, lower, upper):
"""Differential entropy (natural log)."""
alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
Z = norm_cdf(beta) - norm_cdf(alpha)
phi_a = norm_pdf(alpha)
phi_b = norm_pdf(beta)
return (
0.5 * np.log(2 * np.pi * np.e * sigma**2)
+ np.log(Z)
+ (alpha * phi_a - beta * phi_b) / (2 * Z)
)
def truncnorm_mgf(t, mu, sigma, lower, upper):
"""MGF M_X(t) for real t (exists for all real t since support is bounded)."""
t = np.asarray(t, dtype=float)
alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
Z = norm_cdf(beta) - norm_cdf(alpha)
num = norm_cdf(beta - sigma * t) - norm_cdf(alpha - sigma * t)
return np.exp(mu * t + 0.5 * (sigma * t) ** 2) * (num / Z)
def truncnorm_loglik(x, mu, sigma, lower, upper):
x = np.asarray(x, dtype=float)
return np.sum(truncnorm_logpdf(x, mu, sigma, lower, upper))
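As a sanity check, these helpers can be compared against SciPy's implementation (a sketch; the parameter values are illustrative):
mu_c, sigma_c, lo_c, hi_c = 0.5, 1.2, -1.0, 2.0
rv_c = stats.truncnorm((lo_c - mu_c) / sigma_c, (hi_c - mu_c) / sigma_c, loc=mu_c, scale=sigma_c)
x_c = np.linspace(lo_c, hi_c, 9)
print(np.allclose(truncnorm_pdf(x_c, mu_c, sigma_c, lo_c, hi_c), rv_c.pdf(x_c)))
print(np.allclose(truncnorm_cdf(x_c, mu_c, sigma_c, lo_c, hi_c), rv_c.cdf(x_c)))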
4) Moments & properties#
Let \(\alpha=(\ell-\mu)/\sigma\), \(\beta=(u-\mu)/\sigma\), and \(Z=\Phi(\beta)-\Phi(\alpha)\). Define the standardized truncated variable:
\[
T = \frac{X-\mu}{\sigma} \;\sim\; \bigl(T_0 \,\big|\, \alpha \le T_0 \le \beta\bigr), \qquad T_0 \sim \mathcal{N}(0,1).
\]
A convenient summary is:
| Quantity | Expression |
|---|---|
| Mean | \(\mathbb{E}[X]=\mu+\sigma\,\dfrac{\varphi(\alpha)-\varphi(\beta)}{Z}\) |
| Variance | \(\mathrm{Var}(X)=\sigma^2\Bigl(1+\dfrac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{Z}-\bigl(\dfrac{\varphi(\alpha)-\varphi(\beta)}{Z}\bigr)^2\Bigr)\) |
| Skewness | computed from standardized central moments of \(T\) (see below) |
| Excess kurtosis | computed from standardized central moments of \(T\) |
| MGF | \(M_X(t)=e^{\mu t+\tfrac12\sigma^2 t^2}\,\dfrac{\Phi(\beta-\sigma t)-\Phi(\alpha-\sigma t)}{Z}\) |
| CF | \(\varphi_X(\omega)=M_X(i\omega)\) (complex extension of \(\Phi\)) |
| Entropy | \(h(X)=\tfrac12\log(2\pi e\sigma^2)+\log Z+\dfrac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{2Z}\) |
Raw-moment formulas (standardized)#
For \(T\) as above:
\[
\mathbb{E}[T] = \frac{\varphi(\alpha)-\varphi(\beta)}{Z}, \qquad
\mathbb{E}[T^2] = 1 + \frac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{Z},
\]
\[
\mathbb{E}[T^3] = \frac{(\alpha^2+2)\,\varphi(\alpha)-(\beta^2+2)\,\varphi(\beta)}{Z}, \qquad
\mathbb{E}[T^4] = 3 + \frac{(\alpha^3+3\alpha)\,\varphi(\alpha)-(\beta^3+3\beta)\,\varphi(\beta)}{Z}.
\]
From these, compute central moments and then skewness/kurtosis in the usual way.
mu, sigma, lower, upper = 0.5, 1.2, -1.0, 2.0
mom = truncnorm_moments(mu, sigma, lower, upper)
{
'mean': mom['mean'],
'variance': mom['variance'],
'skewness': mom['skewness'],
'excess_kurtosis': mom['excess_kurtosis'],
'entropy': truncnorm_entropy(mu, sigma, lower, upper),
'mgf(t=0.5)': truncnorm_mgf(0.5, mu, sigma, lower, upper),
}
{'mean': 0.5,
'variance': 0.6063036262023139,
'skewness': 0.0,
'excess_kurtosis': -0.9776753030120942,
'entropy': 1.0744134977913906,
'mgf(t=0.5)': 1.3838567001159119}
5) Parameter interpretation#
Think in two layers:
Underlying normal: \(Y\sim\mathcal{N}(\mu,\sigma^2)\)
Conditioning (truncation): keep only values in \([\ell,u]\)
How parameters affect shape:
\(\mu\) (location) shifts the underlying normal; after truncation, the actual mean is pulled toward the interval.
\(\sigma\) (scale) controls spread; if \(\sigma\) is large relative to \((u-\ell)\), most mass is chopped off and the shape can become highly skewed.
Bounds \((\ell,u)\) cut tails; narrow intervals force a near-uniform-looking bump (but still “normal-shaped” on the log scale).
In standardized coordinates, \((a,b)=((\ell-\mu)/\sigma,(u-\mu)/\sigma)\) are the bounds measured in standard deviations from \(\mu\).
# How truncation changes shape
def plot_pdf_family(param_sets, title):
fig = go.Figure()
for ps in param_sets:
mu, sigma, lower, upper, name = ps
x = np.linspace(
lower if np.isfinite(lower) else mu - 4 * sigma,
upper if np.isfinite(upper) else mu + 4 * sigma,
600,
)
y = truncnorm_pdf(x, mu, sigma, lower, upper)
fig.add_trace(go.Scatter(x=x, y=y, mode='lines', name=name))
fig.update_layout(title=title, xaxis_title='x', yaxis_title='pdf')
return fig
param_sets_1 = [
(0.0, 1.0, -2.0, 2.0, "mu=0, sigma=1, [-2,2]"),
(0.0, 1.0, 0.0, np.inf, "mu=0, sigma=1, [0,∞)"),
(0.0, 1.0, -0.5, 0.5, "mu=0, sigma=1, [-0.5,0.5]"),
(0.0, 1.0, 2.0, np.inf, "mu=0, sigma=1, [2,∞)"),
]
fig1 = plot_pdf_family(param_sets_1, "Truncated normal PDFs (varying bounds)")
fig1.show()
param_sets_2 = [
(-1.0, 1.0, -1.0, 1.0, "mu=-1, sigma=1, [-1,1]"),
(0.0, 1.0, -1.0, 1.0, "mu=0, sigma=1, [-1,1]"),
(1.0, 1.0, -1.0, 1.0, "mu=1, sigma=1, [-1,1]"),
]
fig2 = plot_pdf_family(param_sets_2, "PDFs with fixed bounds but different mu")
fig2.show()
6) Derivations#
Below we derive the mean/variance for the standardized truncated normal. Let
\[
T \sim \bigl(T_0 \,\big|\, \alpha \le T_0 \le \beta\bigr), \qquad T_0 \sim \mathcal{N}(0,1),
\]
so that \(T\) has density \(\varphi(t)/Z\) on \([\alpha,\beta]\).
Expectation#
Use the key identity \(\varphi'(t)=-t\varphi(t)\), so \(t\varphi(t)=-\varphi'(t)\):
\[
\mathbb{E}[T] = \frac{1}{Z}\int_{\alpha}^{\beta} t\,\varphi(t)\,dt
= \frac{1}{Z}\Bigl[-\varphi(t)\Bigr]_{\alpha}^{\beta}
= \frac{\varphi(\alpha)-\varphi(\beta)}{Z}.
\]
Then \(X=\mu+\sigma T\) gives \(\mathbb{E}[X]=\mu+\sigma\mathbb{E}[T]\).
Variance#
First compute \(\mathbb{E}[T^2]\):
\[
\mathbb{E}[T^2] = \frac{1}{Z}\int_{\alpha}^{\beta} t^2\,\varphi(t)\,dt.
\]
Use integration by parts with \(u=t\) and \(dv=t\varphi(t)dt=-\varphi'(t)dt\):
\[
\int_{\alpha}^{\beta} t^2\varphi(t)\,dt = \Bigl[-t\,\varphi(t)\Bigr]_{\alpha}^{\beta} + \int_{\alpha}^{\beta}\varphi(t)\,dt
= \alpha\varphi(\alpha) - \beta\varphi(\beta) + Z.
\]
So
\[
\mathbb{E}[T^2] = 1 + \frac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{Z}.
\]
Finally
\[
\mathrm{Var}(T) = \mathbb{E}[T^2] - \bigl(\mathbb{E}[T]\bigr)^2
= 1 + \frac{\alpha\varphi(\alpha)-\beta\varphi(\beta)}{Z} - \left(\frac{\varphi(\alpha)-\varphi(\beta)}{Z}\right)^2,
\]
and \(\mathrm{Var}(X)=\sigma^2\,\mathrm{Var}(T)\).
Likelihood#
For observations \(x_1,\dots,x_n\in[\ell,u]\), the likelihood is
\[
L(\mu,\sigma) = \prod_{i=1}^{n} \frac{\varphi\!\bigl((x_i-\mu)/\sigma\bigr)}{\sigma\,Z},
\qquad
\log L(\mu,\sigma) = \sum_{i=1}^{n}\log\varphi\!\bigl((x_i-\mu)/\sigma\bigr) - n\log\sigma - n\log Z.
\]
Because \(Z=\Phi(\beta)-\Phi(\alpha)\) depends on \((\mu,\sigma)\), the MLE has no closed form and typically requires numerical optimization.
def fit_truncnorm_mle(x, lower, upper, mu0=None, sigma0=None):
"""MLE for (mu, sigma) given known truncation bounds [lower, upper]."""
x = np.asarray(x, dtype=float)
lower = float(lower)
upper = float(upper)
if not np.all((x >= lower) & (x <= upper)):
raise ValueError("all observations must lie within [lower, upper]")
if mu0 is None:
mu0 = float(np.mean(x))
if sigma0 is None:
sigma0 = float(np.std(x, ddof=1))
sigma0 = max(sigma0, 1e-6)
def nll(theta):
mu = float(theta[0])
sigma = float(np.exp(theta[1]))
ll = truncnorm_loglik(x, mu, sigma, lower, upper)
if not np.isfinite(ll):
return 1e300
return -ll
res = optimize.minimize(
nll,
x0=np.array([mu0, np.log(sigma0)]),
method='L-BFGS-B',
)
mu_hat = float(res.x[0])
sigma_hat = float(np.exp(res.x[1]))
return mu_hat, sigma_hat, res
# Demo: recover parameters from synthetic data
mu_true, sigma_true, lower, upper = 0.5, 1.2, -1.0, 2.0
a = (lower - mu_true) / sigma_true
b = (upper - mu_true) / sigma_true
x = stats.truncnorm(a, b, loc=mu_true, scale=sigma_true).rvs(size=3000, random_state=rng)
mu_hat, sigma_hat, res = fit_truncnorm_mle(x, lower, upper)
(mu_true, sigma_true), (mu_hat, sigma_hat), res.success
((0.5, 1.2), (0.4995503501371839, 1.229687469851479), True)
7) Sampling & simulation (NumPy-only)#
A conceptually simple sampler is rejection sampling from the underlying normal.
Algorithm (accept–reject)#
Propose \(Y\sim\mathcal{N}(\mu,\sigma^2)\).
If \(Y\in[\ell,u]\), accept and output \(X=Y\). Otherwise reject and try again.
This works because the truncated normal is exactly the conditional distribution \(Y\mid(\ell\le Y\le u)\).
Efficiency#
The acceptance probability is
\[
\Pr(\ell \le Y \le u) = \Phi(\beta) - \Phi(\alpha) = Z.
\]
So the expected number of proposals per accepted sample is \(1/Z\). If \(Z\) is tiny (very narrow interval or far out in the tails), naive rejection can be slow.
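Before sampling, you can compute \(1/Z\) directly from the Section 3 helpers to gauge feasibility (a sketch; the parameter choices are illustrative). Note also that the sampler below reports raw proposal counts: each batch draws at least batch_multiplier * m proposals, so the reported proposals-per-sample ratio can exceed \(1/Z\).
# Expected number of proposals per accepted sample is 1/Z.
for lo_z, hi_z in [(-1.0, 2.0), (2.0, np.inf), (4.0, np.inf)]:
    a_z, b_z = truncnorm_standardized_bounds(0.5, 1.2, lo_z, hi_z)
    Z_z = float(np.exp(truncnorm_logZ(a_z, b_z)))
    print(f"bounds [{lo_z}, {hi_z}): Z = {Z_z:.3g}, 1/Z ≈ {1.0 / Z_z:.3g}")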
def truncnorm_rvs_rejection_numpy(mu, sigma, lower, upper, size=1, rng=None, batch_multiplier=4):
"""Sample from TruncNorm(mu, sigma^2; [lower, upper]) using NumPy-only rejection sampling."""
if rng is None:
rng = np.random.default_rng()
mu = float(mu)
sigma = float(sigma)
lower = float(lower)
upper = float(upper)
if sigma <= 0:
raise ValueError("sigma must be > 0")
if not (lower < upper):
raise ValueError("require lower < upper")
size_tuple = (size,) if isinstance(size, int) else tuple(size)
n = int(np.prod(size_tuple))
if np.isneginf(lower) and np.isposinf(upper):
return rng.normal(loc=mu, scale=sigma, size=size_tuple), n
out = np.empty(n, dtype=float)
filled = 0
proposed = 0
while filled < n:
m = n - filled
batch = max(batch_multiplier * m, 128)
y = rng.normal(loc=mu, scale=sigma, size=batch)
proposed += y.size
acc = y[(y >= lower) & (y <= upper)]
k = min(acc.size, m)
if k > 0:
out[filled : filled + k] = acc[:k]
filled += k
return out.reshape(size_tuple), proposed
s, proposed = truncnorm_rvs_rejection_numpy(0.5, 1.2, -1.0, 2.0, size=50_000, rng=rng)
s.min(), s.max(), s.mean(), proposed / s.size
(-0.9997033556964805, 1.999994264038702, 0.49656530813860444, 4.0)
8) Visualization#
We’ll visualize:
the PDF and CDF
Monte Carlo samples (histogram + empirical CDF)
We’ll use the formulas from Sections 3–4 and the NumPy-only sampler from Section 7.
mu, sigma, lower, upper = 0.5, 1.2, -1.0, 2.0
x_grid = np.linspace(lower, upper, 600)
pdf = truncnorm_pdf(x_grid, mu, sigma, lower, upper)
cdf = truncnorm_cdf(x_grid, mu, sigma, lower, upper)
fig_pdf = px.line(x=x_grid, y=pdf, title="Truncated normal PDF", labels={"x": "x", "y": "pdf"})
fig_pdf.show()
fig_cdf = px.line(x=x_grid, y=cdf, title="Truncated normal CDF", labels={"x": "x", "y": "cdf"})
fig_cdf.show()
# Monte Carlo samples
samples, _ = truncnorm_rvs_rejection_numpy(mu, sigma, lower, upper, size=20_000, rng=rng)
fig_hist = px.histogram(
x=samples,
nbins=60,
histnorm='probability density',
title="Monte Carlo samples (histogram) with PDF overlay",
labels={"x": "x"},
)
fig_hist.add_trace(go.Scatter(x=x_grid, y=pdf, mode='lines', name='theory pdf'))
fig_hist.show()
# Empirical CDF vs theoretical CDF
xs = np.sort(samples)
ecdf = np.arange(1, xs.size + 1) / xs.size
fig_ecdf = go.Figure()
fig_ecdf.add_trace(go.Scatter(x=xs, y=ecdf, mode='lines', name='empirical CDF'))
fig_ecdf.add_trace(go.Scatter(x=x_grid, y=cdf, mode='lines', name='theory CDF'))
fig_ecdf.update_layout(title="Empirical CDF vs theoretical CDF", xaxis_title='x', yaxis_title='cdf')
fig_ecdf.show()
9) SciPy integration (scipy.stats.truncnorm)#
SciPy’s truncated normal is scipy.stats.truncnorm.
Key points:
a and b are standardized bounds (in units of standard deviations).
loc and scale are the underlying normal's \((\mu,\sigma)\).
Mapping from absolute bounds \([\ell,u]\):
\[
a = \frac{\ell - \mu}{\sigma}, \qquad b = \frac{u - \mu}{\sigma}.
\]
Then:
rv = stats.truncnorm(a, b, loc=mu, scale=sigma)
If you call fit, SciPy estimates (a, b, loc, scale) directly.
If your truncation bounds are known and fixed in absolute units, you usually want a custom likelihood fit (Section 6), because (a,b) depend on \((\mu,\sigma)\) through the mapping above.
mu, sigma, lower, upper = 0.5, 1.2, -1.0, 2.0
a = (lower - mu) / sigma
b = (upper - mu) / sigma
rv = stats.truncnorm(a, b, loc=mu, scale=sigma)
# pdf/cdf/rvs
x_grid = np.linspace(lower, upper, 5)
rv.pdf(x_grid), rv.cdf(x_grid), rv.rvs(size=3, random_state=rng)
(array([0.193 , 0.3467, 0.4215, 0.3467, 0.193 ]),
array([0. , 0.2033, 0.5 , 0.7967, 1. ]),
array([1.424 , 1.7351, 0.2584]))
# Compare SciPy moments to our formulas
m_scipy, v_scipy, skew_scipy, kurt_scipy = rv.stats(moments='mvsk')
mom = truncnorm_moments(mu, sigma, lower, upper)
{
'mean_scipy': float(m_scipy),
'mean_formula': mom['mean'],
'var_scipy': float(v_scipy),
'var_formula': mom['variance'],
'skew_scipy': float(skew_scipy),
'skew_formula': mom['skewness'],
'excess_kurt_scipy': float(kurt_scipy),
'excess_kurt_formula': mom['excess_kurtosis'],
}
{'mean_scipy': 0.5,
'mean_formula': 0.5,
'var_scipy': 0.6063036262023138,
'var_formula': 0.6063036262023139,
'skew_scipy': 0.0,
'skew_formula': 0.0,
'excess_kurt_scipy': -0.9776753030120942,
'excess_kurt_formula': -0.9776753030120942}
# Fit example (SciPy parameterization)
samples = rv.rvs(size=5000, random_state=rng)
a_hat, b_hat, loc_hat, scale_hat = stats.truncnorm.fit(samples)
{
'true': (a, b, mu, sigma),
'fit': (a_hat, b_hat, loc_hat, scale_hat),
}
{'true': (-1.25, 1.25, 0.5, 1.2),
'fit': (-1.1095450525138721,
0.9161971056921887,
0.6441090041345641,
1.4780312277747587)}
10) Statistical use cases#
A) Hypothesis testing#
When data are truncated, using “ordinary” normal-model tests can be badly biased. A principled approach is to write down the truncated likelihood and use standard likelihood tools:
Likelihood ratio tests (LRT)
Wald tests (via the observed Fisher information)
Parametric bootstrap (often more accurate in small samples)
B) Bayesian modeling#
A truncated normal is a common way to build a normal prior with hard constraints. With a normal likelihood, the untruncated posterior is normal; adding bounds yields a truncated normal posterior.
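For reference, the code below uses the standard Normal–Normal conjugate update (applied before truncation):
\[
\sigma_{\mathrm{post}}^2 = \left(\frac{1}{\sigma_0^2} + \frac{n}{\sigma_y^2}\right)^{-1},
\qquad
\mu_{\mathrm{post}} = \sigma_{\mathrm{post}}^2\left(\frac{\mu_0}{\sigma_0^2} + \frac{n\bar{y}}{\sigma_y^2}\right),
\]
and the bounded posterior is \(\mathrm{TruncNorm}\bigl(\mu_{\mathrm{post}}, \sigma_{\mathrm{post}}^2; [\ell,u]\bigr)\).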
C) Generative modeling#
Truncated normals are used whenever you want “Gaussian-like” randomness without extreme outliers. A well-known example is truncated normal weight initialization in deep learning, where weights are drawn from a normal distribution restricted to a bounded range (e.g. \(\pm2\sigma\)); out-of-range draws are rejected and redrawn rather than clipped to the boundary.
# A) Likelihood ratio test for H0: mu = mu0 (bounds known)
lower, upper = -1.0, 2.0
mu_true, sigma_true = 0.6, 1.1
mu0 = 0.0
a = (lower - mu_true) / sigma_true
b = (upper - mu_true) / sigma_true
data = stats.truncnorm(a, b, loc=mu_true, scale=sigma_true).rvs(size=2500, random_state=rng)
# Unrestricted MLE
mu_hat, sigma_hat, _ = fit_truncnorm_mle(data, lower, upper)
ll_alt = truncnorm_loglik(data, mu_hat, sigma_hat, lower, upper)
# Null: mu fixed, optimize sigma only
def fit_sigma_given_mu(x, mu_fixed, lower, upper, sigma0=None):
x = np.asarray(x, dtype=float)
if sigma0 is None:
sigma0 = float(np.std(x, ddof=1))
sigma0 = max(sigma0, 1e-6)
def nll(log_sigma):
sigma = float(np.exp(log_sigma))
ll = truncnorm_loglik(x, mu_fixed, sigma, lower, upper)
if not np.isfinite(ll):
return 1e300
return -ll
res = optimize.minimize_scalar(nll, bracket=(np.log(sigma0) - 1.0, np.log(sigma0) + 1.0))
return float(np.exp(res.x)), res
sigma0_hat, _ = fit_sigma_given_mu(data, mu0, lower, upper)
ll_null = truncnorm_loglik(data, mu0, sigma0_hat, lower, upper)
lr_stat = 2 * (ll_alt - ll_null)
p_value = stats.chi2.sf(lr_stat, df=1)
{
'mu_hat': mu_hat,
'sigma_hat': sigma_hat,
'sigma_hat_under_H0': sigma0_hat,
'lr_stat': lr_stat,
'p_value_chi2_approx': p_value,
}
{'mu_hat': 0.5615952358558073,
'sigma_hat': 1.023327322118709,
'sigma_hat_under_H0': 1.862372609449451,
'lr_stat': 166.35630655358545,
'p_value_chi2_approx': 4.6241821384229533e-38}
# B) Bayesian modeling: truncated-normal prior + normal likelihood
# Parameter theta is constrained to [0, 1]
lower, upper = 0.0, 1.0
# Prior: theta ~ TruncNorm(mu0, sigma0^2; [0,1])
mu0, sigma0 = 0.5, 0.25
# Data: y_i | theta ~ Normal(theta, sigma_y^2)
sigma_y = 0.10
theta_true = 0.7
n = 30
y = rng.normal(loc=theta_true, scale=sigma_y, size=n)
# Untruncated Normal-Normal posterior parameters
post_var = 1.0 / (1.0 / sigma0**2 + n / sigma_y**2)
post_mu = post_var * (mu0 / sigma0**2 + n * y.mean() / sigma_y**2)
post_sigma = float(np.sqrt(post_var))
# With bounds, posterior is truncated normal with the same (post_mu, post_sigma) but restricted to [0,1]
x_grid = np.linspace(lower, upper, 600)
prior_pdf = truncnorm_pdf(x_grid, mu0, sigma0, lower, upper)
post_pdf = truncnorm_pdf(x_grid, post_mu, post_sigma, lower, upper)
fig = go.Figure()
fig.add_trace(go.Scatter(x=x_grid, y=prior_pdf, mode='lines', name='prior'))
fig.add_trace(go.Scatter(x=x_grid, y=post_pdf, mode='lines', name='posterior'))
fig.update_layout(title='Prior vs posterior (both truncated normal)', xaxis_title='theta', yaxis_title='density')
fig.show()
{
'y_mean': float(y.mean()),
'post_mu_untruncated': post_mu,
'post_sigma_untruncated': post_sigma,
'posterior_mean_truncated': truncnorm_moments(post_mu, post_sigma, lower, upper)['mean'],
}
{'y_mean': 0.7145105676253577,
'post_mu_untruncated': 0.7133725805292019,
'post_sigma_untruncated': 0.018208926018230744,
'posterior_mean_truncated': 0.7133725805292019}
# C) Generative modeling: truncated normal for "Gaussian-like" randomness without extreme outliers
sigma = 1.0
n = 200_000
w_normal = rng.normal(loc=0.0, scale=sigma, size=n)
w_trunc = stats.truncnorm(-2.0, 2.0, loc=0.0, scale=sigma).rvs(size=n, random_state=rng)
{
'max_abs_normal': float(np.max(np.abs(w_normal))),
'max_abs_trunc': float(np.max(np.abs(w_trunc))),
}
{'max_abs_normal': 4.673029245895192, 'max_abs_trunc': 1.9999682329410746}
# Compare tails visually
x_grid = np.linspace(-4, 4, 800)
pdf_normal = stats.norm(loc=0, scale=1).pdf(x_grid)
pdf_trunc = stats.truncnorm(-2.0, 2.0, loc=0.0, scale=1.0).pdf(x_grid)
fig = go.Figure()
fig.add_trace(go.Scatter(x=x_grid, y=pdf_normal, mode='lines', name='Normal(0,1)'))
fig.add_trace(go.Scatter(x=x_grid, y=pdf_trunc, mode='lines', name='TruncNorm(0,1;[-2,2])'))
fig.update_layout(title='Tail behavior: normal vs truncated normal', xaxis_title='x', yaxis_title='pdf')
fig.show()
11) Pitfalls#
Truncation vs censoring: truncation removes out-of-range values; censoring records them at the boundary.
SciPy parameter confusion: a and b are standardized; the actual support is (loc + a*scale, loc + b*scale).
Numerical issues: when \(Z=\Phi(\beta)-\Phi(\alpha)\) is tiny, naive subtraction can underflow to 0. Use log-CDF computations (or SciPy’s implementation).
Slow rejection sampling: when \(Z\) is tiny, accept–reject becomes inefficient; consider inverse-CDF sampling (see the sketch after this list) or specialized tail samplers.
Fitting with known bounds: SciPy’s fit estimates (a, b, loc, scale); if you have fixed absolute bounds, use a custom likelihood (Section 6).
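A minimal inverse-CDF sampler, assuming the helpers from Section 3 (it uses scipy.special.ndtri, the inverse standard normal CDF; when \(\Phi(\alpha)\) and \(\Phi(\beta)\) are both extremely close to 0 or 1 this too loses precision, and a dedicated tail sampler is preferable):
def truncnorm_rvs_inverse_cdf(mu, sigma, lower, upper, size=1, rng=None):
    """Sample via U ~ Uniform(Phi(alpha), Phi(beta)), then X = mu + sigma * ndtri(U)."""
    if rng is None:
        rng = np.random.default_rng()
    alpha, beta = truncnorm_standardized_bounds(mu, sigma, lower, upper)
    u = rng.uniform(norm_cdf(alpha), norm_cdf(beta), size=size)
    return mu + sigma * special.ndtri(u)

s_inv = truncnorm_rvs_inverse_cdf(0.5, 1.2, -1.0, 2.0, size=10_000, rng=rng)
s_inv.min(), s_inv.max()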
# SciPy parameterization pitfall: (a, b) are in *standard deviations*, not absolute units
mu, sigma = 0.5, 0.2
lower, upper = 0.0, 1.0
rv_wrong = stats.truncnorm(0.0, 1.0, loc=mu, scale=sigma)
rv_right = stats.truncnorm((lower - mu) / sigma, (upper - mu) / sigma, loc=mu, scale=sigma)
{
'wrong_support': rv_wrong.support(),
'right_support': rv_right.support(),
}
{'wrong_support': (0.5, 0.7), 'right_support': (0.0, 1.0)}
12) Summary#
truncnorm is a normal distribution conditioned to lie in \([\ell,u]\).
The PDF is the normal PDF divided by the probability mass \(Z=\Phi(\beta)-\Phi(\alpha)\) in the interval.
Truncation shifts the mean toward the interval and can induce strong skewness.
Likelihood-based inference should use the truncated likelihood; naive normal-model procedures can be biased.
A simple NumPy-only sampler is accept–reject from \(\mathcal{N}(\mu,\sigma^2)\).
In SciPy, a and b are standardized bounds: a = (lower - mu) / sigma, b = (upper - mu) / sigma.